Introduction

In  this  note,  we  shall  argue  that  mutual  appreciation  of  human 
fallibility  can  resolve  the  Prisoners'  Dilemma  that  so  starkly  describes 
many  conflict  situations.  To  do  so,  we  shall  suggest  a  new 
interpretation  of  an  old  idea:  successive  elimination  of  dominated 
strategies.  We  shall  compare  it  to  other  solutions  which  reflect 
individual  recognition  of  human  fallibility,  such  as  perfect  and 
sequential  equilibrium.  Then  we  shall  apply  it  to  a  repeated  Prisoners' 
Dilemma  in  which  players  are  allowed  to  base  their  actions  only  on  the 
outcome  of  the  previous  round  of  play.  While  the  results  depend  on 
whether  players  are  allowed  to  react  to  their  own  previous  moves  or  not, 
we  shall  find  that  the  stubbornly  noncooperative  behavior  that 
characterizes  the  one  shot  game  is  ruled  out  by  the  solution  we  propose. 
Most  "resolutions"  of  Prisoners'  Dilemma  either  include  such 
noncooperative  behavior  as  one  of  the  possible  outcomes,  or  are  able  to 
ensure  cooperation  only  by  the  use  of  ad  hoc  beliefs  or  adjustment 
mechanisms  specifically  designed  to  favor  cooperation.  The  "wide 
equilibrium"  concept  we  shall  propose  is  largely  free  of  arbitrary 
specifications.


There  are  two  fundamental  ideas  behind  our  proposal.  The  first  is  the 
individual  recognition  of  human  fallibility,  and  the  second  is  that  this 
fallibility  will  not  only  be  recognized,  but  will  become  common 
knowledge. After brief discussions of these ideas, we give formal
definitions and discuss the relation between the concept of trembling
hand perfect equilibrium, which is based on individual recognition of
human fallibility, and wide equilibrium, or equilibrium in sequentially
undominated strategies. The final section applies this solution to the
Prisoners' Dilemma.

A. Individual Recognition of Human Fallibility

In  noncooperative  games,  the  issues  of  credibility  and  fallibility  are 
closely  connected.  In  a  game  in  tree  or  extensive  form,  players  are 
making  threats  or  promises  when  they  declare  their  strategies.  A 
particular  n-tuple  of  strategies  is  in  equilibrium  if  no  player  can 
unilaterally  change  his  strategy  in  such  a  way  as  to  improve  his  payoff. 
However,  a  player  might  not  believe  that  a  change  of  strategy  would  be 
unilateral,  if  an  opponent  has  made  a  threat  that  he  would  not  wish  to 
carry  out.  The  requirement  that  all  threats  be  credible  in  the  sense 
that  each  player  would  actually  wish  to  carry  out  all  his  threats  if 
push  came  to  shove  is  captured  in  the  idea  of  subgame  perfectness 
[Selten,  1973].  It  reflects  mutual  appreciation  of  human  rationality, 
since  a  threat  is  credible  if  a  rational  opponent  would  carry  it  out. 

The  issue  of  credibility  does  not  naturally  arise  in  games  represented  in 
normal  or  strategic  form,  where  all  players  make  a  single  simultaneous 
move.  On  the  other  hand,  we  can  obtain  a  stronger  perfectness  concept  by 
changing  our  focus  from  credibility  to  fallibility.  Suppose  that  player  A 
does  not  believe  that  player  B  will  carry  out  a  threat  because  it  is  not  in 
B's  interest  to  do  so.  If  the  players  are  in  equilibrium,  this  means  that 
player  B  is  not  called  upon  to  carry  out  his  threat  unless  player  A  changes 
his  move.  In  the  normal  form  of  the  game,  player  B  commits  himself  to  all 
the  contingencies  in  his  strategy  when  he  declares  it.  Therefore,  player  A 
will  know  that  the  threat  would  be  carried  out  automatically.  In 
committing  himself  to  the  threat,  player  B  is  relying  on  player  A's  ability 
to  avoid  invoking  it.  If  player  B  thought  that  player  A  would  be  unable  to 
carry  out  his  declared  strategy  (which  avoids  the  threat)  with  perfect 
accuracy,  he  would  not  wish  to  commit  himself  to  the  costly  threat. 

While  subgame  perfectness  means  that  players  view  each  other's 
strategies  as  products  of  a  rational  mind,  the  idea  of  human  fallibility 
outlined  above  suggests  that  no  player  should  place  too  much  reliance  on 
the  rationality  of  his  opponents.  The  emphasis  has  shifted  from  acts  to 
beliefs. In subgame perfectness, a player's acts must be rational given
the belief that the other players will follow their declared strategies
for  the  "rest  of  the  game."  By  contrast,  the  idea  of  fallibility  rests 
on  beliefs  which  are  not  derived  from  declared  strategies. 

There  are  several  ways  of  capturing  this  idea  in  a  solution.  Perhaps  the 
simplest is the concept of trembling hand perfectness [Selten, 1975]. An
equilibrium  is  trembling  hand  perfect  iff  it  is  the  limit  of  a  sequence  of 
completely-mixed  n-tuples  of  strategies  in  which  players  are  required  to 
use  inferior  strategies  with  asymptotically  negligible  probability.  The 
beliefs  of  the  players  are  embedded  in  the  relative  probabilities  with 
which  strategies  are  used.  Each  player  has  the  same  model  of  independent 
mistakes  or  trembles  by  the  other  players,  and  the  model  is  selected  to  fit 
a  given  equilibrium. 

The  idea  that  players  might  face  situations  in  which  their  beliefs  about 
prior  or  concurrent  actions  determine  their  choices  can  also  be  expressed 
in  extensive-form  games.  If  a  player  reaches  a  position  he  should  not 
have  reached  had  players  been  adhering  to  their  declared  strategies,  he 
must  have  some  beliefs  as  to  what  occurred.  If  he  knows  exactly  what 
occurred,  we  may  proceed  as  in  the  definition  of  subgame  perfectness.  If 
not,  each  player  will  need  a  model  of  what  went  wrong.  The  concept  of 
sequential equilibrium [Kreps and Wilson, 1982a] uses an explicit set of
beliefs  to  support  behavior  at  "unreachable"  positions.  This  is  a  weaker 
concept  than  trembling  hand  perfectness,  since  the  beliefs  are  not 
required  to  entail  completely  mixed  independent  trembles  by  the  other 
players.  In  addition,  players  are  not  required  to  use  the  same  model  of 
beliefs. Therefore, every trembling hand perfect equilibrium is sequential,
and every sequential equilibrium is subgame perfect.

Both  sequentiality  and  trembling  hand  perfection  support  particular 
outcomes  by  particular  beliefs.  In  addition,  both  sequentiality  and 
subgame  perfectness  require  players  to  evaluate  the  conduct  of  other 
players  in  the  rest  of  the  game  according  to  their  declared  strategies, 
even in the face of evidence that other players are not adhering to
their declared strategies. Since everyone must know the beliefs of
everyone  else  in  order  to  satisfy  the  intuitive  concept  of  credibility, 
it  seems  appropriate  to  look  for  a  solution  that  invokes  human 
fallibility  without  specifying  the  beliefs  that  players  must  hold. 

If  a  player  expects  other  players  to  use  completely  mixed  strategies,  he 
will  never  wish  to  put  any  weight  on  a  weakly  dominated  strategy.  This 
will  be  true  no  matter  what  mistakes  he  expects  them  to  make.  For  this 
reason,  no  trembling  hand  perfect  equilibrium  uses  such  dominated 
strategies.  We  cannot  go  further  without  specifying  the  model  of  mistakes, 
so  we  shall  define  individual  recognition  of  human  fallibility  by  the 
requirement  that  no  player  will  use  a  dominated  strategy. 

B.  Mutual  Recognition  of  Human  Fallibility 

The  second  idea  upon  which  our  proposal  rests  is  that  there  are 
important  differences  between  what  each  person  knows  and  what  is  common 
knowledge  [Aumann,  1976].  An  event  is  common  knowledge  iff  every  finite 
statement  of  the  form  "A  knows  that  B  knows  that  .  .  .  n  knows  whether 
the  event  has  occurred"  is  true.  One  can  regard  common  knowledge  as  the 
end  result  of  a  process  of  inference  [Cave,  1983].  This  is  illustrated 
by  the  following  story: 

In a room, n people are gathered, $r \ge 2$ of whom are wearing red hats,
while  the  others  have  blue  hats.  The  hats  were  donned  in  the  dark,  so 
that  no  one  knows  the  color  of  his  own  hat.  After  the  lights  are  turned 
on,  a  bell  is  rung  at  one-minute  intervals,  and  persons  wearing  red  hats 
are  instructed  to  leave  the  room  at  the  next  ring  after  they  discover 
their  hat's  color.  The  bell  rings  for  a  long  time,  but  nobody  leaves  the 
room. 

Then  a  loudspeaker  announces  "There  is  at  least  one  red  hat  in  the 
room." As $r \ge 2$, this comes as no news to anyone. Nonetheless, r rings
later,  all  the  people  wearing  red  hats  leave  the  room.  As  a  corollary, 
if  by  some  chance  all  the  people  with  red  hats  failed  to  leave  the  room,  at 
the next ring all those wearing blue hats would leave.

What is important in this story is the fact that, while everyone knew
that  there  were  red  hats  in  the  room,  it  was  not  common  knowledge  until 
it  was  announced.  Suppose  r  =  2.  In  this  case,  each  person  wearing  a  red 
hat  would  have  observed  the  presence  of  one  red  hat.  If  person  A  sees 
that  person  B  is  wearing  a  red  hat,  he  knows  that  either  his  own  hat  is 
blue,  in  which  case  person  B  should  see  no  red  hats,  or  person  B  sees  a 
red  hat,  which  must  be  person  A's.  Therefore,  person  A  knows  that  if 
person  B  does  not  leave  the  room  immediately  after  the  announcement,  his 
own  hat  must  be  red.  The  missing  element  was  that  A  did  not  know  that  B 
knew that there was a red hat.
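The inference in the hat story can be traced mechanically. The sketch below is illustrative and not part of the original argument: it treats each assignment of hat colors as a possible "world", lets the announcement remove the all-blue world, and after every ring at which nobody leaves discards the worlds in which somebody would have left. The function names and the tuple encoding of hats are assumptions made for this example.

```python
from itertools import product


def knows_own_hat_is_red(person, world, possible):
    """True if `person` can deduce a red hat in `world`.

    A person sees every hat but their own, so the worlds they cannot rule out
    are those in `possible` that agree with `world` on everyone else's hat.
    """
    candidates = [
        w for w in possible
        if all(w[j] == world[j] for j in range(len(world)) if j != person)
    ]
    return all(w[person] == "R" for w in candidates)


def first_departure(hats, announced=True):
    """Return (ring, leavers) for the first ring at which anyone leaves, or None."""
    n = len(hats)
    possible = set(product("RB", repeat=n))
    if announced:
        # The announcement makes "at least one red hat" common knowledge.
        possible = {w for w in possible if "R" in w}
    for ring in range(1, n + 1):
        leavers = [i for i in range(n) if knows_own_hat_is_red(i, hats, possible)]
        if leavers:
            return ring, leavers
        # Nobody left, and everyone saw that: discard every world in which
        # somebody would have left at this ring.
        possible = {
            w for w in possible
            if not any(knows_own_hat_is_red(i, w, possible) for i in range(n))
        }
    return None
```

Calling `first_departure(("R", "R", "B"))` reproduces the r = 2 case discussed above, while passing `announced=False` shows that without the announcement no world is ever discarded and nobody leaves.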

In  noncooperative  games,  it  is  usual  to  assume  that  the  strategy  spaces  and 
payoff  functions  are  common  knowledge.  While  this  assumption  is  not 
necessary  for  Nash  equilibrium,  it  does  play  a  role  in  the  definition  of 
sequential  equilibrium  and  perfectness  concepts.  Rather  than  requiring 
that all players share or recognize each other's beliefs, we propose
adding one further piece of common knowledge to the game: that no
player will use a dominated strategy.

This  announcement  will  trigger  a  chain  of  inference.  Each  person  will 
already  have  discarded  his  or  her  dominated  strategies.  After  hearing  the 
announcement,  they  will  no  longer  expect  other  players  to  use  their 
dominated  strategies.  This  means  that  the  "rows"  and  "columns" 
corresponding  to  those  strategies  are  deleted  from  the  description  of  the 
game.  In  the  reduced  game,  there  may  well  be  more  dominations,  and  the 
process  continues  until  no  more  can  be  found. 


This  idea  is  far  from  original  with  this  paper.  It  appears,  for  example, 
in  Luce  and  Raiffa  [1959]  as  "strategies  undominated  in  the  wide  sense". 
Farquharson  [1969]  terms  it  "sophisticated  behavior  under  complete 
information,"  while  Moulin  [1979]  considers  "dominance  solvable"  games  where 
iterated  dominance  reduces  to  a  single  outcome.  However,  this  interpretation 
of the solution is new.

In the event that some players have more than one strategy undominated
in the wide sense, we shall impose Nash equilibrium on the reduced game.
This is the solution we shall call "wide equilibrium."

C.  Perfection,  Dominance,  and  Wide  Equilibrium 

We  begin  by  defining  the  games  and  solution  concepts  we  shall  use. 

Definition 1: a finite game is $[N, \Sigma, h]$, where

$N = \{1, \ldots, n\}$ is the finite set of players;

$\Sigma = \Sigma_1 \times \cdots \times \Sigma_n$ is the space of pure
strategy n-tuples, a Cartesian product of finite sets; and

$h: \Sigma \to R^n$ is the payoff function.

The space of mixed strategies of player i is a simplex of dimension
$\#\Sigma_i$, denoted $M_i$. The product of the $M_i$ is denoted $M^N$; it
is a subset of $C$, the simplex of dimension $\#\Sigma$, called the space of
correlated strategies.

If $\xi \in C$, then we denote by $\xi(s)$ the probability with which
$s \in \Sigma$ is selected when $\xi$ is played.

The space of correlated strategies of all players except player i is denoted
$C_{-i}$. It is a simplex of dimension $\#\Sigma_{-i}$, where:

$$\Sigma_{-i} = \prod_{j \ne i} \Sigma_j$$

We shall always use the subscript "-i" to denote the exclusion of player i.
Thus an n-tuple s of pure strategies in which the strategy $s_i$ of player
i is replaced by $t_i \in \Sigma_i$ is denoted $(t_i, s_{-i})$.

We can extend h to a payoff function $H: C \to R^n$, by:

$$H(\xi) \equiv \sum_{s \in \Sigma} \xi(s) h(s)$$

Similarly, if $\mu = (\mu_1, \ldots, \mu_n) \in M^N$, then $\mu(s_i)$ denotes
the probability with which player i uses his pure strategy
$s_i \in \Sigma_i$.

Finally, for each player we define an upper semicontinuous best reply
correspondence $\beta_i: C_{-i} \to M_i$ by:

$$\beta_i(\xi_{-i}) \equiv \arg\max \{ H_i(\mu_i, \xi_{-i}) : \mu_i \in M_i \}$$

When $\xi \in M_i \times C_{-i}$ there is no ambiguity in writing
$\beta_i(\xi)$ for $\beta_i(\xi_{-i})$.

The product of the $\beta_i$ is denoted $\beta$, and a Nash equilibrium is
any fixed point of $\beta: M^N \to M^N$.


Definition 2: a Nash equilibrium $m^*$ is trembling-hand perfect iff
there exists a sequence $\mu^t$ of strictly positive members of $M^N$ and a
corresponding sequence of positive numbers $\epsilon^t$ with the following
properties:

i) $\epsilon^t \to 0$ and $\mu^t \to m^*$ as $t \to \infty$;

ii) for all t, i: $\mu^t(s_i) \ge \epsilon^t$ implies
$s_i \in \beta_i(\mu^t)$.

In words, a trembling-hand perfect (THP) equilibrium is a limit of
"$\epsilon$-perfect equilibria": completely mixed strategy n-tuples in which
every "inferior" pure strategy is used with probability $< \epsilon$.


Definition 3: a mixed strategy $m_i \in M_i$ of player i is undominated iff
for every $\mu_i' \in M_i$, $\mu_i' \ne m_i$, there exists
$\xi_{-i} \in C_{-i}$ s.t.

$$H_i(m_i, \xi_{-i}) > H_i(\mu_i', \xi_{-i})$$


Definition 4: Let $U_i(N, M^N, H)$ denote the set of undominated strategies
of player i in the game $[N, M^N, H]$, and let $U^N(N, M^N, H)$ be the
Cartesian product of the $U_i(N, M^N, H)$. Since each $\Sigma_i$ is a finite
set, the sequence of games defined by:

$$\Gamma_0 \equiv [N, M^N, H]$$
$$\Gamma_t \equiv [N, U^N(\Gamma_{t-1}), H]$$

has a unique limit attained in a finite number of iterations. This game
is written $\Gamma^W = [N, M^W, H]$, where $M^W$ is the space of mixed
strategies undominated in the wide sense. The pure strategies in $M^W$
are denoted $\Sigma^W$. An equilibrium of $[N, M^W, H]$ is a wide
equilibrium of $[N, M^N, H]$.

In  general,  the  conjectures  players  entertain  about  each  other's  moves 
should  bear  some  relation  to  what  they  know  about  the  game.  The  effect  of 
the  difference  between  common  knowledge  and  what  is  merely  known  to 
everyone  can  be  seen  in  the  "Chain  Store  Paradox"  analyzed  by  Selten  [1978] 
and  Kreps  and  Wilson  [1982a].  In  that  example,  a  firm  faces  a  sequence  of 
potential  rivals.  If  a  rival  enters,  the  firm  will  lose  money  in  that 
market.  It  has  available  to  it  a  Pareto  inferior  retaliatory  strategy  that 
can  inflict  losses  on  the  entrant.  If  this  game  is  played  with  a  finite 
number  of  rivals  (periods)  under  conditions  of  common  knowledge,  there  is 
no  possibility  of  deterring  entry  via  the  threat  of  retaliation.  It  is 
common  knowledge  that  the  last  potential  entrant  cannot  be  deterred,  since 
it  cannot  help  the  incumbent  to  retaliate.  Therefore,  the  penultimate 
entrant  will  not  be  deterred,  since  the  last  rival  will  enter  no  matter 
what has happened in the past. This, too, is common knowledge, and the
serial  game  collapses  to  the  one  shot  game. 

On the other hand, deterrence is possible if the condition that payoffs are
common knowledge is relaxed. This can be illustrated in a model with
one incumbent, I, and two potential entrants, A and B. If B assigned
a positive prior probability to a payoff function which rewards I
for retaliation, then I could ensure that B would not enter the market
by retaliating against A, thus causing B to revise his estimate of
I's "toughness" upwards. Moreover, even if B did not hold this belief,
I would be led to carry out the retaliation against A if he thought
that B could be deterred. Going one step further, if A thought that
I thought that B could be deterred, then A would anticipate that I would
retaliate, and would be deterred. If the stakes are high enough, and if
beliefs are thought to be sufficiently sensitive to observation, then
deterrence is possible any time the payoffs are not common knowledge,
even if (as in the latter two cases above) they are known to everyone.

The process of revision for this example is dealt with explicitly in
Kreps and Wilson [1982a], and Milgrom and Roberts [1982].

If  we  start  with  the  position  that  payoffs  are  common  knowledge,  there  are 
still  possibilities  for  inference  as  regards  conjectured  behavior  by 
opponents.  In  particular,  under  conditions  of  less-than-perfect  certainty 
about  the  ability  or  willingness  of  other  players  to  carry  out  their 
declared  strategies,  adding  as  common  knowledge  the  statement  that  "no 
player  will  use  a  dominated  strategy"  will  initiate  a  chain  of  inference 
tending  to  a  situation  in  which  the  strategies  actually  considered  by  a 
player  will  be  strategies  undominated  in  the  wide  sense.  Inference  and 
communication  lead  us  to  this  solution  without  the  necessity  of  requiring 
that  all  players  share  a  common,  independent  model  of  each  other's  mistakes 
or  even  that  we  be  able  to  write  down  an  explicit  set  of  beliefs  that  might 
support  observed  behavior. 

Theorem 5: Let $m^*$ be a trembling hand perfect equilibrium, and let
$s_i$ be a dominated strategy of player i. Then $m^*(s_i) = 0$.

Proof: If $\mu_i$ dominates $s_i$, and if $\mu_{-i} \in C_{-i}$ is strictly
positive in each coordinate, $H_i(s_i, \mu_{-i}) < H_i(\mu_i, \mu_{-i})$.
Therefore $\mu^t(s_i) < \epsilon^t$ for all t, and $m^*(s_i) = 0$. QED

Theorem 6: Let $s^*$ be a pure strategy equilibrium of a two person game
such that each $s_i^*$ is undominated. Then $s^*$ is trembling-hand perfect.

Proof: Let $m^\epsilon$ be an $\epsilon$-perfect equilibrium, and let us
write:

$$m_i^\epsilon = (1-\epsilon) s_i^* + \epsilon m_i'$$

for some completely mixed strategy $m_i'$. The payoff to player $j \ne i$ if
he uses the strategy $s_j$ can be written:

$$H_j(s_j, m_i^\epsilon) = (1-\epsilon) H_j(s_j, s_i^*) + \epsilon H_j(s_j, m_i')$$

The equilibrium strategy $s_j^*$ is a best reply to $s_i^*$, and since it is
undominated, there is some completely mixed strategy $m_i'$ such that $s_j^*$
is a best reply to $m_i'$ as well. Therefore it is a best reply to
$m_i^\epsilon$. Finally, since player i is called upon to use his perturbed
strategy $m_i'$ with total probability $\epsilon$, $m^\epsilon$ is an
$\epsilon$-perfect equilibrium. QED

In  some  circumstances,  this  characterization  can  be  extended.  For  example, 
all  completely  mixed  equilibria  are  trembling  hand  perfect,  and  none  places 
positive  probability  on  a  weakly  dominated  strategy.  However,  it  is  easy 
to  show  by  examples  that  i)  not  all  equilibria  in  undominated  strategies 
are  trembling  hand  perfect;  and  ii)  even  pure  strategy  equilibria  in 
undominated  strategies  may  be  imperfect  if  there  are  more  than  two  players. 

i)  Not  all  equilibria  in  undominated  strategies  are  trembling  hand  perfect: 

       d     e     f
    +-----+-----+-----+
  a | 1,2 | 3,0 | 0,0 |
    +-----+-----+-----+
  b | 1,2 | 0,3 | 3,0 |
    +-----+-----+-----+
  c | 1,0 | 2,0 | 2,3 |
    +-----+-----+-----+

Example 1

There  is  an  equilibrium  in  which  player  1  uses  undominated  strategies 
"a"  and  "b"  with  equal  probability,  while  player  2  uses  "d"  with 
probability 1. However, in any $\epsilon$-perfect equilibrium where player 1
is indifferent between "a" and "b," neither can be a best reply.
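The claim about Example 1 can be checked with a few lines of arithmetic. The sketch below is illustrative only: it encodes the payoffs as read from the Example 1 table (row "b" taken as 1,2 / 0,3 / 3,0), confirms that the half-half mixture over "a" and "b" against "d" is an equilibrium, and then perturbs player 2's strategy. The equal tremble weights on "e" and "f" are what indifference between "a" and "b" requires; the particular value of `eps` is an arbitrary assumption.

```python
from fractions import Fraction

# Payoffs (player 1, player 2) of Example 1; rows a, b, c and columns d, e, f.
PAYOFF = {
    ("a", "d"): (1, 2), ("a", "e"): (3, 0), ("a", "f"): (0, 0),
    ("b", "d"): (1, 2), ("b", "e"): (0, 3), ("b", "f"): (3, 0),
    ("c", "d"): (1, 0), ("c", "e"): (2, 0), ("c", "f"): (2, 3),
}
ROWS, COLS = "abc", "def"


def row_payoffs(col_mix):
    """Player 1's expected payoff from each pure row against a column mixture."""
    return {r: sum(col_mix[c] * PAYOFF[r, c][0] for c in COLS) for r in ROWS}


def col_payoffs(row_mix):
    """Player 2's expected payoff from each pure column against a row mixture."""
    return {c: sum(row_mix[r] * PAYOFF[r, c][1] for r in ROWS) for c in COLS}


half = Fraction(1, 2)
# Candidate equilibrium: player 1 mixes a and b equally, player 2 plays d.
at_d = row_payoffs({"d": 1, "e": 0, "f": 0})
vs_mix = col_payoffs({"a": half, "b": half, "c": 0})

# Let player 2 tremble toward e and f with equal weight eps. Equal weights are
# forced by requiring player 1 to stay indifferent between a and b.
eps = Fraction(1, 100)
trembled = row_payoffs({"d": 1 - 2 * eps, "e": eps, "f": eps})
```

Against "d" alone all three rows pay the same, and "d" is player 2's best reply to the mixture, so the profile is an equilibrium. Under the tremble, "a" and "b" pay 1 + eps each while "c" pays 1 + 2 eps, so neither "a" nor "b" remains a best reply.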

ii)  Not  all  pure  undominated  strategy  equilibria  are  perfect  if  n  >  2: 

If  a  pure  strategy  is  undominated  there  is  some  correlated  strategy  of  the 
other  players  to  which  it  is  a  best  reply.  However,  that  strategy  may  not 
be independent as required by $\epsilon$-perfect equilibrium. In addition, the
"perturbation"  a  player  uses  to  make  one  player  use  a  given  undominated 
equilibrium  strategy  may  differ  from  the  "perturbation"  required  to  make 
another  player  use  his  undominated  equilibrium  strategy.  An  example  of 
this second type of failure is given by the three person game below.

[Table: payoffs of the three person game. Player 1 chooses row "a" or "b",
player 2 chooses column "c" or "d", and player 3 chooses matrix "e" or "f";
asterisks mark the pure strategy equilibria. Legible entries: 1,1,1;
0,0,1; and 0,0,-10.]

Example 2

Example  2 

The  pure  strategy  equilibria  are  indicated  with  asterisks.  Strategies 
"a"  and  "c"  are  weakly  dominant  for  players  1  and  2  respectively,  so  the 
only  possible  pure  strategy  perfect  equilibria  are  (a,c,e)  and  (a,c,f). 

Both  "e"  and  "f"  are  undominated.  However,  (a,c,e)  cannot  be  perfect. 

If player 1 plays "b" with probability $\epsilon$ (close to 0) and player 2
plays "d" with probability $w$ (close to 0), player 3's payoff if
he plays "e" is $1 - 11(1-\epsilon)w$; his payoff if he plays "f" is
$1 - 11\epsilon w$. He will therefore only choose "e" if $\epsilon > 1/2$.

Theorem 7: Every game $[N, M^N, H]$ possesses at least one wide equilibrium.

Proof: By construction, H is continuous on the compact and nonempty set
$M^W$. Therefore, the best reply correspondence $\beta^W$ for the game
$[N, M^W, H]$ is upper semicontinuous. In addition, the set of mixed
strategies of player i dominated by a member of $M_i^W$ is convex, so that
$\beta_i^W$ is convex-valued, and $[N, M^W, H]$ has an equilibrium $m^*$,
which is also an equilibrium of the original game $[N, M^N, H]$. QED


Every  wide  equilibrium  is  a  fortiori  undominated,  so  every  pure 
strategy  wide  equilibrium  in  a  two-person  game  is  perfect,  although  the 
Prisoners'  Dilemma  example  of  the  next  section  shows  that  the  converse  is 
false.  Also,  not  every  wide  equilibrium  is  perfect.  In  Example  1  above, 
all  equilibria  are  wide  equilibria,  since  no  strategy  is  dominated.  This 
includes  the  imperfect  equilibrium  in  which  player  1  uses  "a"  and  "b" 
with  probability  1/2  each,  and  player  2  uses  "d"  with  probability  one. 


D.  Prisoners'  Dilemma  with  Finite  Memory 

In this section, we concentrate on an infinitely-repeated Prisoners'
Dilemma game, in which players do not discount the future, and in which
their strategies are constrained to depend only on the outcome of the
previous round of play. We shall examine two variants of this example.
In the first, a player's move at any stage depends only on his opponent's
previous move. In the second, a player's move may depend on both his own
and his opponent's previous moves. In each game we shall consider pure
strategy Nash, perfect, and wide equilibria. This extends previous work by
Aumann, Kurz, and the author [1978], in which it was shown that Tit-for-Tat
is the unique wide equilibrium of the first variant.

The  Prisoners'  Dilemma  we  shall  use  is: 

       L     R
    +-----+-----+
  T | 1,1 | 4,0 |
    +-----+-----+
  B | 0,4 | 3,3 |
    +-----+-----+

It  has  a  unique  equilibrium  in  dominant  strategies  at  (T,L).  The  Appendix 
extends  the  reactive  memory  analysis  to  a  general  Prisoners'  Dilemma. 

One  Period  ("Reactive")  Memory  of  Opponent's  Move 

If  this  game  is  played  as  an  undiscounted  supergame  in  which  a  player's 
move  is  allowed  to  depend  on  his  opponent's  last  move,  we  obtain  a 
normal  form  game  in  which  each  player  has  eight  pure  strategies.  A 
strategy for player 1 [2] is written (a,b,c) [(d,e,f)] where:

a [d] is the move in the first period;

b [e] is the move when the opponent's previous move was R [B]; and

c [f] is the move when the opponent's previous move was L [T].

As written, each player has two redundant strategies. Since we are
concerned with long-term average payoffs, the noncooperative strategies
(B,T,T) [(R,L,L)] and (T,T,T) [(L,L,L)] are the same, as are the cooperative
strategies (T,B,B) [(L,R,R)] and (B,B,B) [(R,R,R)]. The payoff matrix for
the reduced game is shown below. It has two pure strategy equilibria:
[(B,B,T),(R,R,L)] with payoff (3,3), and [(T,T,T),(L,L,L)] with payoff (1,1).


        |L,L,L|L,L,R|L,R,L|R,L,R|R,R,L|R,R,R|
        +-----+-----+-----+-----+-----+-----+
  T,T,T | 1,1 | 4,0 | 1,1 | 4,0 | 1,1 | 4,0 |
        +-----+-----+-----+-----+-----+-----+
  T,T,B | 0,4 | 2,2 | 2,2 | 4,0 | 2,2 | 4,0 |
        +-----+-----+-----+-----+-----+-----+
  T,B,T | 1,1 | 2,2 | 1,1 | 2,2 | 2,2 | 3,3 |
        +-----+-----+-----+-----+-----+-----+
  B,T,B | 0,4 | 0,4 | 2,2 | 2,2 | 2,2 | 4,0 |
        +-----+-----+-----+-----+-----+-----+
  B,B,T | 1,1 | 2,2 | 2,2 | 2,2 | 3,3 | 3,3 |
        +-----+-----+-----+-----+-----+-----+
  B,B,B | 0,4 | 0,4 | 3,3 | 0,4 | 3,3 | 3,3 |
        +-----+-----+-----+-----+-----+-----+

The payoff matrix for the reduced game
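The entries of the reduced game are long-run averages over the cycle into which play eventually settles. The following sketch shows one way to compute them; it is an illustration rather than the paper's own procedure. Moves are relabelled from each player's own point of view, with "D" for the greedy move (T or L) and "C" for the cooperative move (B or R), and the function and constant names are assumptions introduced here.

```python
from fractions import Fraction

# Stage game from the text, indexed by (player 1's move, player 2's move):
# "D" is the greedy move (T or L) and "C" the cooperative move (B or R).
STAGE = {("D", "D"): (1, 1), ("D", "C"): (4, 0), ("C", "D"): (0, 4), ("C", "C"): (3, 3)}


def reply(strategy, opponent_move):
    """A reactive strategy is (first move, reply to C, reply to D)."""
    return strategy[1] if opponent_move == "C" else strategy[2]


def average_payoffs(s1, s2):
    """Long-run average payoffs: the mean over the cycle that play settles into."""
    moves = (s1[0], s2[0])
    first_seen, history = {}, []
    while moves not in first_seen:
        first_seen[moves] = len(history)
        history.append(STAGE[moves])
        # Each player reacts only to the opponent's previous move.
        moves = (reply(s1, moves[1]), reply(s2, moves[0]))
    cycle = history[first_seen[moves]:]  # drop any transient before the cycle
    return tuple(Fraction(sum(p[k] for p in cycle), len(cycle)) for k in (0, 1))


TIT_FOR_TAT = ("C", "C", "D")         # (B,B,T) or (R,R,L)
ALWAYS_GREEDY = ("D", "D", "D")       # (T,T,T) or (L,L,L)
GREEDY_TAT_FOR_TIT = ("D", "D", "C")  # (T,T,B) or (L,L,R)
```

Because the transient before the cycle is dropped, a strategy such as ("C", "D", "D"), the paper's (B,T,T), earns the same average as `ALWAYS_GREEDY`, which is why it is treated as redundant.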


        |L,L,L|L,L,R|R,R,L|R,R,R|
        +-----+-----+-----+-----+
  T,T,T | 1,1 | 4,0 | 1,1 | 4,0 |
        +-----+-----+-----+-----+
  T,T,B | 0,4 | 2,2 | 2,2 | 4,0 |
        +-----+-----+-----+-----+
  B,B,T | 1,1 | 2,2 | 3,3 | 3,3 |
        +-----+-----+-----+-----+
  B,B,B | 0,4 | 0,4 | 3,3 | 3,3 |
        +-----+-----+-----+-----+

Stage 2


Each player has two dominated strategies. The initially cooperative
"Tat-for-Tit" strategy that responds to cooperation with greed and vice
versa, (B,T,B) [(R,L,R)], is dominated by the initially greedy Tat-for-Tit
strategy (T,T,B) [(L,L,R)]. Also, the initially greedy "Tit-for-Tat"
strategy (T,B,T) [(L,R,L)] is dominated by the initially cooperative
Tit-for-Tat strategy (B,B,T) [(R,R,L)]. As neither dominated strategy is an
equilibrium strategy, both equilibria are, by Theorem 6, trembling hand (and
thus subgame) perfect. Eliminating dominated strategies, we get the Stage 2
matrix.

At Stage 2, the "appeasement" strategy (B,B,B) [(R,R,R)] is dominated by
the Tit-for-Tat strategy (B,B,T) [(R,R,L)]. This leads to Stage 3 (below)
at which Tit-for-Tat dominates the initially greedy Tat-for-Tit strategy
(T,T,B) [(L,L,R)]. Finally, at Stage 4 Tit-for-Tat dominates the
unrelentingly noncooperative strategy (T,T,T) [(L,L,L)]. Therefore, the
only wide equilibrium of this game is the Pareto optimal Tit-for-Tat
equilibrium [(B,B,T),(R,R,L)].


        |L,L,L|L,L,R|R,R,L|
        +-----+-----+-----+
  T,T,T | 1,1 | 4,0 | 1,1 |
        +-----+-----+-----+
  T,T,B | 0,4 | 2,2 | 2,2 |
        +-----+-----+-----+
  B,B,T | 1,1 | 2,2 | 3,3 |
        +-----+-----+-----+

Stage 3


        |L,L,L|R,R,L|
        +-----+-----+
  T,T,T | 1,1 | 1,1 |
        +-----+-----+
  B,B,T | 1,1 | 3,3 |
        +-----+-----+

Stage 4
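The four stages of elimination can be reproduced mechanically from player 1's payoffs in the six-by-six reduced game. The sketch below is a simplified illustration: it checks domination by pure strategies only, which is weaker than Definition 3 in general but is enough to recover the stages described here. It relies on the symmetry of the game, so that the same index set survives for both players, and it takes the (T,T,B) against (R,R,R) entry to be 4, as in the Stage 2 matrix. All names are assumptions made for this example.

```python
# Player 1's long-run average payoffs in the reduced reactive game.
# Rows follow NAMES; columns are the mirror-image strategies of player 2
# (LLL, LLR, LRL, RLR, RRL, RRR) in the same order. By symmetry, player 2
# faces the same matrix, so one index set tracks both players' survivors.
NAMES = ["TTT", "TTB", "TBT", "BTB", "BBT", "BBB"]
P1 = [
    [1, 4, 1, 4, 1, 4],
    [0, 2, 2, 4, 2, 4],  # last entry assumed to be 4, as in the Stage 2 matrix
    [1, 2, 1, 2, 2, 3],
    [0, 0, 2, 2, 2, 4],
    [1, 2, 2, 2, 3, 3],
    [0, 0, 3, 0, 3, 3],
]


def dominated(alive):
    """Indices in `alive` weakly dominated by another surviving pure strategy."""
    out = set()
    for s in alive:
        for t in alive:
            gaps = [P1[t][o] - P1[s][o] for o in alive]
            if t != s and min(gaps) >= 0 and max(gaps) > 0:
                out.add(s)
                break
    return out


def elimination_stages():
    """Delete all dominated strategies at once, stage by stage."""
    alive = list(range(len(NAMES)))
    stages = [[NAMES[i] for i in alive]]
    while True:
        bad = dominated(alive)
        if not bad:
            return stages
        alive = [i for i in alive if i not in bad]
        stages.append([NAMES[i] for i in alive])
```

`elimination_stages()` returns the surviving row strategies at each stage, ending with Tit-for-Tat alone. Deleting every dominated strategy simultaneously at each stage is a design choice of this sketch; with weak domination, other deletion orders need not agree in general.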


One  Period  ("Reactive-Signalling")  Memory  of  Both  Players'  Moves 

In  this  version  of  the  game  each  player  has  five  information  sets,  and 
therefore  32  pure  strategies.  The  generic  pure  strategy  can  be  written 
(v,w,x,y,z), where:

v is the player's initial move;

w is the player's move if the previous moves were (T,R);

x is the player's move if the previous moves were (T,L);

y is the player's move if the previous moves were (B,R); and

z is the player's move if the previous moves were (B,L).

As in the reactive memory case, there are some redundant strategies:
(B,B,B,B,B), (B,T,B,B,B), (B,B,T,B,B), and (B,T,T,B,B) are all strategies
of perpetual cooperation, as are (R,R,R,R,R), (R,L,R,R,R), (R,R,R,L,R), and
(R,L,R,L,R). Similarly, (T,T,T,T,T), (T,T,T,T,B), (T,T,T,B,T),
(T,T,T,B,B); (L,L,L,L,L), (L,L,L,L,R), (L,L,R,L,L), and (L,L,R,L,R) are
all perpetually noncooperative strategies.

Associated  with  any  pair  of  strategies  is  a  cycle  of  at  most  four  outcomes 
that  will  result  if  the  strategies  are  used.  As  this  is  an  undiscounted 
supergame,  the  order  of  play  within  the  cycles  does  not  matter,  and  the 
cycles  provide  a  convenient  way  of  presenting  the  payoff  matrix.  The  cycles 
and  associated  payoffs  are  shown  below,  together  with  the  ranking  for 
each  player. 


label  cycle   payoff   | label  cycle   payoff   | label  cycle      payoff
a      BR      3,3      | f      BR,TR   7/2,3/2  | k      BR,BL,TR   7/3,7/3
b      BL      0,4      | g      BR,TL   2,2      | l      BR,BL,TL   4/3,8/3
c      TR      4,0      | h      BL,TR   2,2      | m      BR,TR,TL   8/3,4/3
d      TL      1,1      | i      BL,TL   1/2,5/2  | n      BL,TR,TL   5/3,5/3
e      BR,BL   3/2,7/2  | j      TR,TL   5/2,1/2  | o      all        2,2

Player 1:  c > f > a > m > j > k > h = o = g > n > e > l > d > i > b

Player 2:  b > e > a > l > i > k > h = o = g > n > f > m > d > j > c
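The payoff attached to each cycle is the average of the stage payoffs over the outcomes in that cycle, and the rankings follow by sorting. The short sketch below recomputes both from the stage game; the dictionary and function names are assumptions introduced for illustration.

```python
from fractions import Fraction

# Stage payoffs (player 1, player 2) for each outcome of the Prisoners' Dilemma.
STAGE = {"BR": (3, 3), "BL": (0, 4), "TR": (4, 0), "TL": (1, 1)}

# The fifteen cycles, labelled as in the text.
CYCLES = {
    "a": ["BR"], "b": ["BL"], "c": ["TR"], "d": ["TL"],
    "e": ["BR", "BL"], "f": ["BR", "TR"], "g": ["BR", "TL"],
    "h": ["BL", "TR"], "i": ["BL", "TL"], "j": ["TR", "TL"],
    "k": ["BR", "BL", "TR"], "l": ["BR", "BL", "TL"],
    "m": ["BR", "TR", "TL"], "n": ["BL", "TR", "TL"],
    "o": ["BR", "BL", "TR", "TL"],
}


def cycle_payoff(label):
    """Average payoff pair over the outcomes making up one cycle."""
    outcomes = CYCLES[label]
    return tuple(
        Fraction(sum(STAGE[o][k] for o in outcomes), len(outcomes)) for k in (0, 1)
    )


def ranking(player):
    """Cycle labels from best to worst for `player` (0 or 1); ties keep label order."""
    return sorted(CYCLES, key=lambda lab: cycle_payoff(lab)[player], reverse=True)
```

For instance, `cycle_payoff("k")` averages BR, BL and TR, and `ranking(0)` lists the labels in player 1's order of preference, with the three cycles worth 2 to each player (g, h, o) tied in the middle.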

The  following  table  shows  the  outcomes  as  a  function  of  the  nonredundant 
strategies, which have been numbered for convenience.

V

|L|R|L|R|L|R|L|R|L|L|R|L|R|L 

R 

L|L|R|L|R|L|R|R|R|L|R| 

w 

|R|R|R|R|R|R|R|R|R|R|R|R|R|R 

R 

L|L|L|L|L|L|L|L|L|L|L| 

1 

X 

|R|R|R|R|L|L|L|L|R|R|R|L|L|L 

L 

R|R|R|L|L|L|L|R|L|L|L| 

y 

|R|R|R|R|R|R|R|R|L|L|L|L|L|L 

L 

R|R|R|R|R|R|R|L|L|L|L| 

• 

2 

|R|R|L|L|R|R|L|L|R|L|L|R|R|L 

L 

R|L|L|R|R|L|L|L|R|L|L| 

v,w,x,y,z 

i  i  i  i  I  i  I  i  i  1 1 1 1 1 1 1 1  i  i 

1 

1|1|1|1|2|2|2|2|2|2|2| 

^  • 

|1|2|3|4|5|6|7|8|9|0|1|2|3|4 

5 

6|7|8|9|0|1|2|3|4|5|6| 

1 

1 

T,B,B,B,B 

|a|a|e|e|a|a|e|eja|b|b|a|b|b 

b 

a|e|e|a|a|e|e|b|b|b|b| 

• 

2 

B,B,B,B,B 

|a|a|e|e|a|a|e|e|b|b|b|b|a|b 

b 

a|e|e|a|a|e|e|b|a|b|b| 

3 

T,B,B,B,T 

|f|f|g|g|k|k|g|g|fig|g|b|b|g 

b 

f  1 1 1 1 |k|k| 1 1 1 |b|b|b|bf 

4 

B , B , B , B  ,T 

ififigjgikjkigigibjb|g|bjbib 

g 

f|l|l|k|k|l|l|b|b|b|b| 

5 

t.b.b.t.b 

|a|a|k|k|a|h|h|h|a|l|l|a|a|l 

1 

a|k|k|h|h|h|h|i|i|i|i| 

r 

6 

b,b,b,t,b 

|a|a|k|k|h|a|h|h|a]l|l|a|a|l 

1 

a|k|k|h|a|h|h|i|a|i|i| 

j 

4 

* 

7 

T,B,B,T,T 

|f|f|g|g|h|h|g|h|£|g|g|o|o|g 

g 

f |o|o|h|h|h|h|i|i|i|i| 

• 

8 

b,b,b,t,t 

|f|f|g|g|h|h|h|g|f|g|g|o|o|g 

g 

f |o|o|h|h|h|h|i|i|i|i| 

9 

t,b,t,b,b 

|a|c|e|c|a|a|e|e|a|b|c|a|a|b 

b 

aie|cja|ajejeicib|b|b| 

10 

T,B,T,B,T 

icjcigicjmimjgjgjcjgjc|mim|g 

g 

c|l[c|o(o|l|l|c|b|b|b{ 

- 

11 

b.b.t.b.t 

lc|c|g|g|m|m|g|g|b|b|g|b|m|b 

g 

c|l|l|o|o|l|l|b|b|b|b| 

12 

T,B ,T,T,B 

|a|c|c|c|aja|o|o|a|l|c|a|a|l 

1 

c|c|c|n|n|n|n|c| i| i| i| 

• 

• 

13 

B,B,T,T,B 

|c|a|c|c|a|a|o|o|a|l|l|a|a|l 

1 

cj  c|  c  |  n  |  a  | n j  n|  i  |  a  |  i  |  i  | 

• 

14 

T,B ,T,T,T 

ic|c|g|c|m|m|gjg|cjgjcjmjm|g 

g 

cjc|c|n|njnjn|c| ij ij i| 

_  - 

15 

B ,B ,T,T,T 

jc|c|c|g|m|m|g|g|c|g|g|m|m|g 

g 

cjc|c|n|n|n|n|i|i|i|i| 

16 

T,T,B,B,B 

|a|a|eje|a|a|e|e|a|b|b|b|b|b 

b 

d|d|e|d|a|d|e|b|b|d|b| 

17 

T,T,B,B,T 

| f| flmlmiklkjojoi fjmjmibjbjb 

b 

d|d|d|d|k|d|d|d|b|d|b| 

* 

18 

B.T.B.B.T 

jfjfjmjmjkjkjojoibjbjmjbjbjb 

b 

f|d|d|k|k|d|d|d|b|b|d| 

- 

*— ^ 

s 

19 

T,T,B,T,B 

iajaikjk|h|h|h|hja{o|o|n{n|n 

n 

d|d|k|d|h|d|h|d|d|d|d| 

r  • 

20 

b,t,b,t,b 

|aja|k|k|h|a|h|h|ajojojn|ajn 

n 

a|k|k|h|a|h|h|d|a|d|d| 

• 

21 

T,T,B ,T,T 

|  f  j  f  |  m  j  m  j  h  j  h  j  h  j  h  j  f  |  m  |  m  j  n  |  n  |  n 

n 

d|d|d|d|h|d|h|d|d|d|d| 

22 

B,T,B,T,T 

j  f| fjm|m|hjh|hihj  fjm]minjn|n 

n 

f |d|d|h|h|h|d|d|d|d|d| 

• 

23 

B,T,T,B,T 

|c|c|c|c|j|j|j|j|b|b|c|b|j|b 

j 

c|djd|d|d|d|d|d|d|b|d| 

24 

B,T,T,T,B 

|c|a|c|c|j|a|j|j|c|c|c|j|ajj 

j 

c|c|c|d|a|d|d|d|a|d|d| 

i* 

25 

T,T,T,T,T 

|  C  j  C  j  C  j  C  j  j  I  j  1  j  1  j  j  c  j  c  j  c  1  j  j  j  1  j 

j 

d|d|c|d|d|d|d|c|d|d|d| 

*  • 

4 

26 

B,T,T,T,T 

|c|c|c|c| j|j|j|jic|c|cjjjj)j 

j 

c|c|d|d|d|d|d|d|d|d|d| 

Outcomes  for  the  one-period  reactive-signalling  memory  game 

This game has 26 pure equilibria with the cooperative outcome "a"; 4 pure equilibria with the noncooperative outcome "d"; and 2 pure equilibria with the semicooperative outcome "h" (players take turns cooperating). None uses a weakly dominated strategy, so all are trembling hand perfect. They are:

Cooperative equilibria: $\{5,12,13\}^2$; $\{6,13,20,24\}^2$; (6,12); (12,6).

Noncooperative equilibria: $\{25,26\}^2$.

Semi-cooperative equilibria: (21,22); (22,21).


The  following  table  shows  the  equivalence  between  strategies  of  the  two 
players,  the  stages  at  which  they  are  eliminated,  and  the  strategies  that 
dominate  them. 


        Dominated strategy                        Dominating strategy

Stage | Player 1      No.  Player 2    |   | Player 1      No.  Player 2

  1   | (T,B,B,B,B)    1   (L,R,R,R,R) |   | (T,B,T,B,B)    9   (L,R,R,L,R)
  1   | (B,B,B,T,T)    8   (R,R,L,R,L) | = | (T,B,B,T,T)    7   (L,R,L,R,L)
  1   | (B,B,T,B,T)   11   (R,R,R,L,L) |   | (T,B,T,B,T)   10   (L,R,R,L,L)
  1   | (T,T,B,T,B)   19   (L,L,L,R,R) |   | (B,T,B,T,B)   20   (R,L,L,R,R)
  1   | (B,T,T,B,T)   23   (R,L,R,L,L) |   | (B,T,T,T,T)   26   (R,L,L,L,L)
  2   | (B,B,B,B,B)    2   (R,R,R,R,R) |   | (B,B,T,T,B)   13   (R,R,L,L,R)
  3   | (T,B,T,B,B)    9   (L,R,R,L,R) |   | (B,B,T,T,B)   13   (R,R,L,L,R)
  3   | (T,B,T,T,B)   12   (L,R,L,L,R) |   | (B,B,T,T,B)   13   (R,R,L,L,R)
  3   | (T,T,B,B,T)   17   (L,L,R,R,L) |   | (B,T,T,T,B)   24   (R,L,L,L,R)
  3   | (B,T,B,B,T)   18   (R,L,R,R,L) |   | (B,T,T,T,B)   24   (R,L,L,L,R)
  3   | (T,T,T,T,T)   25   (L,L,L,L,L) |   | (B,T,T,T,B)   24   (R,L,L,L,R)
  3   | (B,T,T,T,T)   26   (R,L,L,L,L) |   | (B,T,T,T,B)   24   (R,L,L,L,R)
  4   | (B,B,B,T,B)    6   (R,R,L,R,R) |   | (B,T,B,T,B)   20   (R,L,L,R,R)
  4   | (T,T,B,B,B)   16   (L,L,R,R,R) |   | (B,B,T,T,B)   13   (R,R,L,L,R)
  5   | (B,B,B,B,T)    4   (R,R,R,R,L) |   | (B,B,T,T,B)   13   (R,R,L,L,R)
  6   | (T,B,T,T,T)   14   (L,R,L,L,L) |   | (B,B,T,T,T)   15   (R,R,L,L,L)
  6   | (T,B,B,B,T)    3   (L,R,R,R,L) |   | mixed domination
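The stage-by-stage procedure behind this table can be sketched in code. The version below is a simplification under stated assumptions: at each stage it removes, for both players at once, every pure strategy that is weakly dominated by another surviving pure strategy. It does not test domination by mixed strategies (needed for the last row of the table), and its stage numbers need not coincide with those reported above. The function names and the `ONE_SHOT` example are illustrative.

```python
def weakly_dominates(u, s, t, opponents):
    """True if s is never worse than t and strictly better at least once."""
    diffs = [u(s, o) - u(t, o) for o in opponents]
    return all(d >= 0 for d in diffs) and any(d > 0 for d in diffs)


def eliminate(rows, cols, u1, u2):
    """Iteratively remove weakly dominated pure strategies (pure domination only).

    u1(r, c) and u2(r, c) are the payoffs of the row and column player.
    Returns (surviving rows, surviving columns, {strategy: stage removed}).
    """
    rows, cols = list(rows), list(cols)
    col_payoff = lambda c, r: u2(r, c)   # column player's payoff, own move first
    removed, stage = {}, 0
    while True:
        stage += 1
        bad_rows = [t for t in rows
                    if any(weakly_dominates(u1, s, t, cols)
                           for s in rows if s != t)]
        bad_cols = [t for t in cols
                    if any(weakly_dominates(col_payoff, s, t, rows)
                           for s in cols if s != t)]
        if not bad_rows and not bad_cols:
            return rows, cols, removed
        for t in bad_rows:
            removed[("row", t)] = stage
        for t in bad_cols:
            removed[("col", t)] = stage
        rows = [r for r in rows if r not in bad_rows]
        cols = [c for c in cols if c not in bad_cols]


# Usage on the one-shot game, with the stage payoffs of cycles a to d.
ONE_SHOT = {("T", "L"): (1, 1), ("T", "R"): (4, 0),
            ("B", "L"): (0, 4), ("B", "R"): (3, 3)}
survivors = eliminate("TB", "LR",
                      lambda r, c: ONE_SHOT[r, c][0],
                      lambda r, c: ONE_SHOT[r, c][1])
```

In the one-shot game the procedure stops after a single stage at the noncooperative pair (T, L), which is the dominant-strategy outcome that the supergame analysis is meant to escape.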

The  matrix  corresponding  to  strategies  undominated  in  the  wide  sense  is: 
v                L   R   L   R   R   R   L   R   R
w                R   R   R   R   R   L   L   L   L
x                L   L   R   L   L   L   L   L   L
y                R   R   L   L   L   R   R   R   L
z                R   L   L   R   L   R   L   L   R

v,w,x,y,z        5   7  10  13  15  20  21  22  24

 5  T,B,B,T,B   *a   h   l  *a   l   h   h   h   i
 7  B,B,B,T,T    h   g   g   o   g   h   h   h   i
10  T,B,T,B,T    m   g   g   m   g   o   l   l   b
13  B,B,T,T,B   *a   o   l  *a   l  *a   n   n  *a
15  B,B,T,T,T    m   g   g   m   g   n   n   n   i
20  B,T,B,T,B    h   h   o  *a   n  *a   h   h  *a
21  T,T,B,T,T    h   h   m   n   n   h   d  *h   d
22  B,T,B,T,T    h   h   m   n   n   h  *h   d   d
24  B,T,T,T,B    j   j   c  *a   j  *a   d   d  *a

*  =  wide  equilibrium 


This game has 11 fully cooperative wide equilibria. In each of them, players will play Tit-for-Tat if their own previous move was cooperative. In addition, both of the semicooperative equilibria with outcome "h" are wide equilibria in which players play Tit-for-Tat if their own previous move was noncooperative. There are no wide equilibria with the noncooperative outcome "d," the Pareto-inferior equilibrium of the one-shot game. Indeed, the noncooperative equilibria are eliminated earlier in this game than in the reactive memory game.

In any event, wide equilibrium resolves the Prisoners' Dilemma for the one-period reactive memory game. There are Pareto-inferior wide equilibria in the reactive-signalling memory game, but they are neither as inefficient nor as compelling as the inefficient dominant strategy equilibrium of the one-shot game. It remains an open question to what extent this resolution by wide equilibrium depends on limited memory. In particular, it would be interesting to compute the wide equilibria of the full supergame for the reactive memory and reactive-signalling memory cases.

The  learning  interpretation  of  wide  equilibrium  makes  this  example  an 
attractive  metaphor  for  the  evolution  of  cooperation,  in  which  increasing 
common  knowledge  about  which  strategies  will  not  be  used  leads  to  the 
elimination  of  the  worst  kind  of  noncooperative  behavior. 

Since  the  chain  of  inference  leading  to  wide  equilibrium  was  triggered 
by  the  common  knowledge  statement  that  no  player  would  use  a  dominated 
strategy,  and  since  the  use  of  dominated  strategies  is  ruled  out  by 
expectations  of  human  fallibility,  it  is  tempting  to  view  the  realistic 
cooperation  involved  in  the  wide  equilibrium  strategies  as  a  result 
of  mutual  recognition  of  human  fallibility. 


Appendix:  Reactive  Memory  in  the  General  Case 

In this section, we show that the Tit-for-Tat cooperative strategies form the only wide equilibrium in a general Prisoners' Dilemma game. The payoff function is:

\[
\begin{array}{c|c|c|}
  & L & R \\ \hline
T & \alpha,\ \alpha & \beta,\ \gamma \\ \hline
B & \gamma,\ \beta & \delta,\ \delta \\ \hline
\end{array}
\]

where

(1)  $\beta > \delta > \alpha > \gamma$

(2)  $2\delta \ge \beta + \gamma$

Assumption (1) guarantees that the unique dominant-strategy equilibrium of the one-shot game is Pareto-inferior, and assumption (2) ensures that the "cooperative" outcome BR is Pareto optimal.

Players  are  allowed  to  react  to  their  opponent's  previous  moves.  After 
redundant  strategies  are  eliminated,  each  player  has  six  pure  strategies 
described  in  the  same  way  as  in  Section  D  above.  There  are  seven 
possible  outcome  cycles.  These  cycles,  and  the  resulting  payoffs  are: 


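\[
\begin{array}{lll|lll}
\text{label} & \text{cycle} & \text{payoff} & \text{label} & \text{cycle} & \text{payoff} \\ \hline
a & BR & (\delta,\ \delta) & g & BR,TL & (\alpha+\delta,\ \alpha+\delta)/2 \\
b & BL & (\gamma,\ \beta) & h & BL,TR & (\beta+\gamma,\ \beta+\gamma)/2 \\
c & TR & (\beta,\ \gamma) & o & \text{all} & (\alpha+\beta+\gamma+\delta,\ \alpha+\beta+\gamma+\delta)/4 \\
d & TL & (\alpha,\ \alpha) & & &
\end{array}
\]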

The  players  rank  these  cycles  as  follows: 


player 1:  $c > a > \max\{h,g\} \ge o \ge \min\{h,g\} > d > b$,
player 2:  $b > a > \max\{h,g\} \ge o \ge \min\{h,g\} > d > c$


The  matrix  for  the  game  is  shown  below  (*  =  pure  strategy  equilibrium). 




         (L,L,L)|(L,L,R)|(L,R,L)|(R,L,R)|(R,R,L)|(R,R,R)

(T,T,T)    d*   |   c   |   d   |   c   |   d   |   c

(T,T,B)    b    |   g   |   o   |   c   |   o   |   c

(T,B,T)    d    |   o   |   d   |   o   |   h   |   a

(B,T,B)    b    |   b   |   o   |   g   |   o   |   c

(B,B,T)    d    |   o   |   h   |   o   |   a*  |   a

(B,B,B)    b    |   b   |   a   |   b   |   a   |   a
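The entries of this matrix can be regenerated by simulation. The sketch below assumes that a strategy triple is read as (opening move, reply to the opponent's cooperative move, reply to the opponent's noncooperative move), with B and R the cooperative moves. This reading is inferred from the text calling (B,B,T) and (R,R,L) the Tit-for-Tat strategies, since the defining section is not reproduced here. The names `LABELS`, `cycle_label`, `ROWS`, `COLS` and `MATRIX` are illustrative.

```python
# Assumed encoding: (opening move, reply to cooperation, reply to defection).
LABELS = {
    frozenset({"BR"}): "a", frozenset({"BL"}): "b",
    frozenset({"TR"}): "c", frozenset({"TL"}): "d",
    frozenset({"BR", "TL"}): "g", frozenset({"BL", "TR"}): "h",
    frozenset({"BR", "BL", "TR", "TL"}): "o",
}


def cycle_label(s1, s2):
    """Label of the limit cycle when s1 (over T/B) meets s2 (over L/R)."""
    reply1 = {"R": s1[1], "L": s1[2]}   # player 1 reacts to player 2's last move
    reply2 = {"B": s2[1], "T": s2[2]}   # player 2 reacts to player 1's last move
    state = (s1[0], s2[0])
    seen = []
    while state not in seen:
        seen.append(state)
        move1, move2 = state
        state = (reply1[move2], reply2[move1])
    cycle = seen[seen.index(state):]    # drop the transient before the cycle
    return LABELS[frozenset(m1 + m2 for m1, m2 in cycle)]


ROWS = ["TTT", "TTB", "TBT", "BTB", "BBT", "BBB"]
COLS = ["LLL", "LLR", "LRL", "RLR", "RRL", "RRR"]
MATRIX = [[cycle_label(r, c) for c in COLS] for r in ROWS]

if __name__ == "__main__":
    for r, line in zip(ROWS, MATRIX):
        print(r, " ".join(line))
```

Only the seven cycles listed for this game can arise with these six strategies per player, so the lookup in `LABELS` is complete for the pairs in `ROWS` and `COLS`.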

On the first round, (B,T,B) [(R,L,R)] is dominated by (T,T,B) [(L,L,R)], (T,B,T) [(L,R,L)] is dominated by (B,B,T) [(R,R,L)] if $\beta + \gamma \ge 2\alpha$, and (T,B,T) [(L,R,L)] is dominated by (T,T,T) [(L,L,L)] if $2\alpha \ge \beta + \gamma$.

Therefore,  the  matrix  of  undominated  strategies  is: 


         (L,L,L)|(L,L,R)|(R,R,L)

(T,T,T)    d*   |   c   |   d

(T,T,B)    b    |   g   |   o

(B,B,T)    d    |   o   |   a*

(B,B,B)    b    |   b   |   a

Both pure strategy equilibria are trembling hand perfect by virtue of Theorem 6. In this game, the naive cooperative strategy (B,B,B) [(R,R,R)] is dominated by the more realistic Tit-for-Tat strategy, leaving

         (L,L,L)|(L,L,R)|(R,R,L)

(T,T,T)    d*   |   c   |   d

(T,T,B)    b    |   g   |   o

(B,B,T)    d    |   o   |   a*

Player 1 [2] prefers c [b] to g, so (T,T,B) [(L,L,R)] is dominated by (T,T,T) [(L,L,L)] if the players prefer d to o, i.e. if

(3)  $4\alpha \ge \alpha + \beta + \gamma + \delta$

By the same token, both players prefer a to o, so (T,T,B) [(L,L,R)] is dominated by (B,B,T) [(R,R,L)] if the players prefer o to g, i.e. if

(4)  $\alpha + \beta + \gamma + \delta \ge 2(\alpha + \delta) > 4\alpha$




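Finally, suppose that

(5)  $2(\alpha + \delta) > \alpha + \beta + \gamma + \delta > 4\alpha$

In this case, (T,T,B) [(L,L,R)] is dominated by a mixed strategy, which can be written $\lambda$(T,T,T) + $(1-\lambda)$(B,B,T) [$\lambda$(L,L,L) + $(1-\lambda)$(R,R,L)] for some $\lambda \in (0,1)$. W.l.o.g., we shall concentrate on player 1's payoff. The mixed strategy clearly does better than (T,T,B) against (L,L,L), so there are two conditions for it to dominate (T,T,B), one against (L,L,R) and one against (R,R,L):

(6)  $(\alpha + \delta)/2 \le \lambda\beta + (1-\lambda)(\alpha + \beta + \gamma + \delta)/4$

(7)  $(\alpha + \beta + \gamma + \delta)/4 \le \lambda\alpha + (1-\lambda)\delta$

The condition for the existence of such a mixed strategy is therefore:

(8)  $(\delta - \alpha)\,(\beta - (\alpha+\delta)/2) \ge (\chi - \alpha)(\beta - \chi)$, where $\chi \equiv (\alpha + \beta + \gamma + \delta)/4$

(5) implies that $(\delta - \alpha) > 2(\chi - \alpha)$, and (1), together with $\chi > \alpha$ from (5), implies that $2\beta - \alpha - \delta > \beta - \chi$, so that (8) is automatically satisfied. Therefore (T,T,B) [(L,L,R)] is dominated, and the strategy matrix reduces to: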

         (L,L,L)|(R,R,L)

(T,T,T)    d*   |   d

(B,B,T)    d    |   a*

in which (B,B,T) dominates (T,T,T). Therefore, the only wide equilibrium is the cooperative Tit-for-Tat equilibrium ((B,B,T), (R,R,L)).
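The step from conditions (6) and (7) to condition (8) in the mixed-domination case can be spelled out. The derivation below is a sketch that reads (6) and (7) as comparisons of per-period payoffs (an assumption about the normalisation) and introduces the shorthand $g \equiv (\alpha+\delta)/2$ for the payoff of cycle g, which is not the paper's notation.

```latex
% Each condition bounds \lambda on one side
% (using \delta > \alpha and \beta > \chi, both implied by (1)):
\begin{align*}
\text{(7)}\quad \chi \le \lambda\alpha + (1-\lambda)\delta
  &\iff \lambda \le \frac{\delta-\chi}{\delta-\alpha}, \\
\text{(6)}\quad g \le \lambda\beta + (1-\lambda)\chi
  &\iff \lambda \ge \frac{g-\chi}{\beta-\chi}.
\end{align*}
% A suitable \lambda exists iff the lower bound does not exceed the upper bound:
\begin{align*}
(\delta-\chi)(\beta-\chi) &\ge (\delta-\alpha)(g-\chi) \\
\iff\ (\delta-\alpha)(\beta-\chi) - (\chi-\alpha)(\beta-\chi)
      &\ge (\delta-\alpha)(g-\chi) \\
\iff\ (\delta-\alpha)(\beta-g) &\ge (\chi-\alpha)(\beta-\chi),
\end{align*}
% which is (8). Under (5), g > \chi > \alpha, so
% \delta-\alpha = 2(g-\alpha) > 2(\chi-\alpha) and
% \beta-g = (2\beta-\alpha-\delta)/2 > (\beta-\chi)/2;
% multiplying the two strict inequalities gives (8).
```

Under (5) both bounds lie strictly between 0 and 1, since $g > \chi$ makes the lower bound positive and $\chi > \alpha$ makes the upper bound less than one, so the resulting $\lambda$ is a proper mixture.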

